Assessing targeted invitation and response modes to improve survey participation in a diverse New York City panel: Healthy NYC

Background: Healthy NYC is an innovative survey panel created by the New York City (NYC) Department of Health and Mental Hygiene (DOHMH) that offers a cost-effective mechanism for collecting priority and timely health information. Between November 2020 and June 2021, invitations for six different surveys were sent to Healthy NYC panelists by postal mail, email, and text messages. Panelists had the option to complete surveys online or via paper survey.
Methods: We used logistic regression models to analyze whether panelists' sociodemographic characteristics were associated with the contact mode they provided and with the type of invitation that led to their response. Poisson regression models were used to determine whether the number of invitations received before participating in a survey was associated with sociodemographic characteristics.
Results: Younger age and higher education were positively associated with providing an email or text contact. Furthermore, age, race, and income were significant predictors for invitation modes that led to a survey response. Black panelists had 72% greater odds (OR 1.72, 95% CI: 1.11–2.68) of responding to a mail invite and 33% lesser odds (OR 0.67, 95% CI: 0.54–0.83) of responding to an email invite compared with White panelists. Additionally, in five of the six surveys, more than half of the respondents completed surveys after one or two invites. Email invitations garnered the highest participation rates.
Conclusions: We recommend using targeted invitation modes as an additional strategy to improve participation in panels. For lower-income panelists who do not provide an email address, it may be reasonable to offer additional response options that do not require internet access. Our study’s findings provide insight into how panels can tailor outreach to panelists, especially among underrepresented groups, in the most economical and efficient ways.


Introduction
Healthy NYC is an innovative local survey panel created by the New York City (NYC) Department of Health and Mental Hygiene's (DOHMH) Division of Epidemiology in the spring of 2020. Healthy NYC provides a cost-effective mechanism for collecting priority and timely health information [1]. Since the panel's inception, Healthy NYC surveys have primarily focused on experiences during, and consequences of, the COVID-19 pandemic. They have provided information on the prevalence of COVID-like illness [2]; opinions about treatment, testing, and vaccination; awareness of governmental recommendations; mental health [3] and financial consequences of the pandemic; and associated racial inequities [4].
While Healthy NYC is a probabilistically constructed panel, encouraging participation in Healthy NYC surveys in ways that maintain representativeness remains a key concern. This is a particular concern in health-related surveys because non-respondents reportedly have a higher prevalence of risky health behaviors and poor self-rated health [5][6][7][8][9][10]. Previous research has found that the type of contact information provided [11,12] and the likelihood of responding to surveys are associated with sociodemographic characteristics [13][14][15][16]. This suggests that inviting panelists using multiple forms of contact information and offering multiple response modes may increase a survey's representativeness. In addition, the number of invitations panelists receive for each survey may motivate them differently and fine-tuning the contact frequency may help to encourage participation. There is currently no consensus in the literature on the optimal number of email invite reminders; however, the first few invites generally garnered the most responses [17][18][19][20][21].
As of April 2021, Healthy NYC was composed of 9,583 NYC residents 18 years and older who had been enrolled through two recruitment efforts in June and September 2020. This paper investigates three questions. First, do panelists vary by their sociodemographic characteristics based on the types of contact information they provide? Second, are panelists' sociodemographic characteristics associated with their responses to invitations received by mail, text, and email? Third, do panelists vary by sociodemographic characteristics and the number of invitations received before participating in a survey? Understanding these components can help us produce targeted invitation systems that result in higher participation rates, more representative surveys, and more satisfied panelists.

Data collection and sampling
Non-institutionalized NYC residents ages 18 and older were probabilistically selected to be invited to join Healthy NYC, primarily from address-based samples, supplemented by those who had previously completed DOHMH probability-based surveys between 2019 and 2020 and agreed to be contacted for future research [1]. In the address-based samples, households were randomly sampled from a list of all residential addresses in NYC. In the samples from respondents to other DOHMH surveys, those respondents had been selected for the initial DOHMH surveys through Random Digit Dialing (RDD) of landline and cell phone numbers, or through random selection of municipal administrative records. To obtain intra-household randomization, the next birthday method was used: the person opening the invitation was asked to have the adult member of the household who would have the next birthday complete the panel registration survey.
For the first recruitment effort (6/8/2020-7/30/2020), residents from a total of 30,000 NYC addresses, and 6,516 people who had taken a previous DOHMH survey and provided a phone number and/or email address were invited to participate in Healthy NYC (Fig 1). In the second recruitment effort (09/01/2020-12/8/2020), residents from 30,000 NYC addresses and 2,571 people who had taken a previous DOHMH survey and provided a phone number and/or email address were invited to participate in Healthy NYC (Fig 1). Panelists could provide an address for mail, an email address for email contact, and a cell phone number and consent to receive text messages in the Registration survey. Fig 2 shows the frequencies and the different combinations of the three contact modes that panelists provided. At the time of the analysis (8/13/2021), a total of 9,524 panelists remained from the 9,583 initially registered after the removal of 45 withdrawals and 14 individuals who did not provide any contact information at registration.
Once enrolled, panelists can be invited to up to 10 surveys yearly. We included six surveys in this analysis: Social Determinants of Health (SDH), Emotional Wellness Survey (EWS), two waves of Health Opinion Polls (HOP), and two NYC Population Health Surveys for COVID-19 (COVID). For each survey, stratified random sampling was used to select a sample from Healthy NYC. Sampled panelists were invited by various combinations of postal mail, email, or text messages, depending on the types of contact information they provided at registration (Fig 2), and we called this the invitation mode(s).
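As an illustration only, stratified random sampling from a panel can be sketched as follows. The stratification variable (age group), the per-stratum sample size, and the field names are hypothetical; the text above does not specify how strata or sample sizes were defined for the Healthy NYC surveys.

```python
import random
from collections import defaultdict


def stratified_sample(panelists, stratum_key, n_per_stratum, seed=0):
    """Draw a simple random sample of up to n_per_stratum panelists per stratum."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    strata = defaultdict(list)
    for panelist in panelists:
        strata[panelist[stratum_key]].append(panelist)

    sample = []
    for key in sorted(strata):
        members = strata[key]
        k = min(n_per_stratum, len(members))  # never request more than available
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical toy panel: 20 panelists spread evenly over four age groups.
age_groups = ["18-24", "25-44", "45-64", "65+"]
panel = [{"id": i, "age_group": age_groups[i % 4]} for i in range(20)]
sampled = stratified_sample(panel, "age_group", n_per_stratum=2)
```

Sampling within strata guarantees that each subgroup contributes to the survey sample, which simple random sampling of the whole panel does not.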
Panelists were recruited into Healthy NYC in two waves, in June 2020 and September 2020. In both waves, most were recruited from a random Address Based Sample (ABS), supplemented by individuals who had completed previous DOHMH surveys and consented to be recontacted for future surveys. Those who completed the Registration survey were considered to be enrolled in the panel.
The surveys were self-administered online or by completing a mailed paper questionnaire. Due to funding limitations, not all panelists who provided a mailing address received a mailed invitation or a paper survey. Instead, panelists were divided into age tertiles (oldest, older, and youngest); those in the oldest category and those who did not provide an email or phone contact (that is, only provided a mailing address) were prioritized to receive mailed invitations. They first received one or two push-to-web invitations, and non-respondents received a paper survey in the final mailing. The surveys were offered in English, Spanish, Simplified and Traditional Chinese (Mandarin and Cantonese by phone), and Russian. Panelists were offered a $20 gift card for completing the June or September registration surveys and $5 for all other surveys.
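The prioritization rule for mailed invitations described above can be sketched roughly as follows. This is a simplified reading, not the operational procedure: the field names and the tertile cut-point calculation are assumptions, and in practice prioritized panelists were not guaranteed a mailing because of the funding limits noted above.

```python
def age_tertile_cutoffs(ages):
    """Return (lower, upper) cut points that split sorted ages into three groups."""
    ordered = sorted(ages)
    n = len(ordered)
    return ordered[n // 3], ordered[(2 * n) // 3]


def prioritized_for_mail(panelist, upper_cutoff):
    """Flag panelists in the oldest tertile or with only a mailing address."""
    if not panelist["has_mail"]:
        return False  # no mailing address, so no mailed invitation is possible
    mail_only = not panelist["has_email"] and not panelist["has_text"]
    oldest_tertile = panelist["age"] >= upper_cutoff
    return oldest_tertile or mail_only


# Hypothetical panelists (field names are illustrative only).
panel = [
    {"age": 22, "has_mail": True, "has_email": True, "has_text": True},
    {"age": 30, "has_mail": True, "has_email": False, "has_text": False},
    {"age": 41, "has_mail": True, "has_email": True, "has_text": False},
    {"age": 55, "has_mail": True, "has_email": True, "has_text": True},
    {"age": 68, "has_mail": True, "has_email": True, "has_text": False},
    {"age": 79, "has_mail": False, "has_email": True, "has_text": True},
]
lower, upper = age_tertile_cutoffs([p["age"] for p in panel])
flags = [prioritized_for_mail(p, upper) for p in panel]
```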
There were 8,802 unique panelists sampled at least once across the six surveys; each was sampled up to five times. A total of 4,729 unique panelists responded to at least one survey.

Sociodemographic variables
Seven sociodemographic variables were used in the analyses: borough of residence, sex assigned at birth, age, race/ethnicity, education, household income, and nativity status. Sex assigned at birth was categorized into male or female; those who answered something other than male or female were excluded from comparative analyses since the small number did not allow for meaningful comparison (<1%). Age was categorized into four groups (18-24, 25-44, 45-64, and 65+). For race/ethnicity, we combined questions on self-reported race and Latino ethnicity into the following categories: White, Black, Latino, Asian or Pacific Islander (API), and American Indian/Multiple/Something Else (AIMSE). Latino includes people of Hispanic or Latino origin, as identified by the survey question "Are you Hispanic or Latino/a?". Those who identified as Latino/a were excluded from the other race categories. Educational attainment was categorized into four levels (less than high school, high school graduate, some college, and college graduate or higher). Household income level was defined using household Federal Poverty Level (FPL) thresholds of <200% and ≥200%. Nativity was classified as being born in the U.S. or U.S. territories or being born outside of the U.S. Missing values were excluded from the analyses.
Fig 2. Types of contact information provided by panelists and types of invitations sent to panelists. The types of initial invitations and reminders that sampled panelists received for any given survey were determined by the types of contact information they provided when they registered for Healthy NYC. * Panelists who provided a cellphone number were asked if this was a smartphone that they could use to take surveys and were then asked to consent to receive text messages. † Due to budgetary and operational limitations, not all panelists who provided a mailing address received mailed paper surveys or push-to-web invitations if they provided any other contact information.
https://doi.org/10.1371/journal.pone.0280911.g002

Statistical analysis
To determine whether panelists varied based on their sociodemographic characteristics and the contact information they provided as well as their response to the type of invitations (paper surveys, push-to-web, email, or text), we used Chi-Square tests to identify bivariate associations. This determined potential predictors for the final logistic regression models (LRM). For the contact mode analysis, there were three outcomes of interest: whether panelists provided an email address (yes/no), a mailing address (yes/no), and a smartphone number with consent to receiving text messages (yes/no).
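For readers unfamiliar with the bivariate screening step, the following minimal sketch computes a Pearson chi-square test for a 2x2 table (for example, one binary characteristic against whether an email address was provided). The counts are invented for illustration and are not Healthy NYC data; the study's analyses were run in SAS, not with this code.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: rows = two groups, columns = provided email (yes, no).
chi2, p_value = chi_square_2x2(90, 10, 70, 30)
is_candidate_predictor = p_value < 0.05  # same significance threshold as the paper
```

Variables that pass such a screen would then be candidates for the multivariable logistic regression models.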
For the analysis of which invitation mode led to a response, there were five outcomes of interest where panelists responded to: a mailed paper survey, a mailed push-to-web letter, an email, a text, or multiple invitation modes (if panelists responded to different modes of invites across surveys). The analysis for each LRM was limited to panelists who were invited at least once by the specific invitation mode. Furthermore, each LRM was restricted to panelists who were invited by at least one additional mode other than the one that led to their response.
To determine whether the number of invitations received before participating in a survey was associated with sociodemographic characteristics, we ran Poisson regression models. The dependent variable was the total number of invitations that each respondent received for each of the six surveys before they responded, and the independent variables were sociodemographic variables. Since panelists who received mail invitations received more invitations regardless of their response status, we did an additional subgroup analysis excluding panelists who received a mail invite. We did not include invitees who never responded to a survey for either analysis. Statistical significance was defined as p<0.05. We assessed whether there was multicollinearity between the seven sociodemographic variables. No two variables were multicollinear based on the variance inflation factor and tolerance diagnostic criteria. The final LRM and the chosen predictors were based on the lowest Akaike Information Criterion score and best Receiver Operating Characteristic curve to minimize model overfitting. Analyses were done using SAS Enterprise Guide 7.1.
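To make the Poisson model concrete, the sketch below handles the simplest case of a single binary predictor, where the maximum likelihood estimate of the rate ratio reduces to the ratio of the two group means. This is an illustrative simplification with invented counts; the models in this study included several sociodemographic predictors and were fit in SAS.

```python
import math


def poisson_rate_ratio(counts_reference, counts_comparison, z=1.96):
    """Rate ratio and Wald 95% CI from a Poisson model with one binary predictor."""
    total_ref = sum(counts_reference)
    total_cmp = sum(counts_comparison)
    mean_ref = total_ref / len(counts_reference)
    mean_cmp = total_cmp / len(counts_comparison)

    log_rr = math.log(mean_cmp / mean_ref)  # regression coefficient for the predictor
    se = math.sqrt(1 / total_ref + 1 / total_cmp)  # model-based standard error
    return (
        math.exp(log_rr),
        math.exp(log_rr - z * se),
        math.exp(log_rr + z * se),
    )


# Hypothetical invitation counts before responding, per respondent.
invites_25_44 = [1, 1, 2, 1, 2, 1]
invites_65_plus = [2, 3, 2, 4, 3, 2]
rate_ratio, ci_low, ci_high = poisson_rate_ratio(invites_25_44, invites_65_plus)
```

A rate ratio above 1 would indicate that the comparison group was sent more invitations, on average, before responding than the reference group.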

Ethics statement
The protocol for Healthy NYC was approved by the NYC DOHMH Institutional Review Board (IRB). When participants enrolled in Healthy NYC and at the start of each subsequent survey, there was consent language explaining the survey, confidentiality, and privacy protections, that participation was voluntary and that surveys could be stopped at any time. We received a waiver of documentation of informed consent from the NYC Health Department's IRB because the research was of minimal risk and written consent was not logistically feasible. Instead, participants' consent was documented in the following three ways: by checking a box on the web survey, by completing a paper survey, or verbally on the phone with documentation by the interviewer.
The data underlying the findings can be accessed through a Data Use Agreement (DUA) with the NYC DOHMH. Readers interested in accessing the data should contact EpiDataRequest@health.nyc.gov to engage in a DUA and receive the data.

Results
The sociodemographic characteristics of Healthy NYC panelists are described in Table 1.
Among the 9,583 panelists, 9,250 (96.5%) provided a mailing address, 8,672 (90.5%) provided an email address, and 5,447 (56.8%) provided a smartphone number and consented to receive texts. We found that certain sociodemographic characteristics predicted the likelihood of providing a mailing address, smartphone number, or email address. Younger age, higher household income, and higher educational attainment were positively associated with providing an email address (Table 2), and younger age and higher educational attainment were positively associated with providing a smartphone number and permission to text (Table 3).
Among the 4,729 respondents who completed one or more surveys, 4,419 were invited by email, 2,413 were invited by text, 1,150 were invited by mail and sent a paper survey with a push-to-web option, 1,674 received a mail invite with only a push-to-web option (no paper survey), and 3,442 received multiple invitation modes. We found that 85% of survey participants responded exclusively to a single invite mode across the six surveys despite most (73%) receiving invitations via multiple modes. The greatest proportion of panelists (54%) responded exclusively to email invitations (Fig 3). Age, race, and household income significantly predicted the specific invitation modes that led to the panelists' survey responses (Tables 5-8). Those with higher household income had 53% lesser odds of responding to a paper survey (OR: 0.47, 95% CI: 0.32-0.68) and 124% greater odds of responding to a push-to-web letter invite (OR: 2.24, 95% CI: 1.74-2.87) compared to those with lower household income.
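The percentages quoted alongside the odds ratios follow directly from the odds ratios themselves: the percent change in odds is (OR - 1) x 100. The short sketch below shows the conversion using the odds ratios reported in this paper; the function names are ours and purely illustrative.

```python
import math


def odds_ratio_from_coefficient(beta):
    """A logistic regression coefficient is a log odds ratio."""
    return math.exp(beta)


def percent_change_in_odds(odds_ratio):
    """Percent change in odds relative to the reference group."""
    return round((odds_ratio - 1.0) * 100.0, 1)


# Odds ratios reported in the text (higher vs. lower household income).
paper_survey = percent_change_in_odds(0.47)  # negative value = lesser odds
push_to_web = percent_change_in_odds(2.24)   # positive value = greater odds
```

Here `paper_survey` is -53.0 (53% lesser odds) and `push_to_web` is 124.0 (124% greater odds), matching the figures in the text.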
Finally, across the six surveys, panelists received between one and seven invites for any given survey (Fig 4). In every survey except the June HOP, more than half of the respondents completed the survey after one or two invites. Email invitations resulted in the highest number of responses (Fig 5). The participation rate shows a slight decline over this period. However, it should be noted that the March and June HOP had a two-week data collection period, as opposed to four weeks for other survey months. There was no association between panelists' sociodemographic characteristics and the number of invitations they received before participating in a survey (Table 9 and S1 Table). In the Poisson models that include all respondents, age was associated with the number of invitations sent before eliciting a response. Across all six surveys, respondents 65+ were sent significantly more invitations before responding compared to those ages 25-44, and respondents 45-64 were sent significantly more invitations compared to those ages 25-44 in four out of six surveys. However, when respondents invited by mail were excluded from the analysis, age was no longer a significant predictor of the number of invitations sent before eliciting a response to most surveys (Table 9 and S1 Table).

Discussion
In our six surveys, we found that certain sociodemographic variables predicted panelists' likelihood of providing certain types of contact information. Additionally, some sociodemographic variables predicted panelists' likelihood of responding to survey invitations received by mail, text, or email. However, sociodemographic variables did not predict the number of invitations needed to elicit a survey response. Based on our findings, it is imperative to offer mixed-mode survey invitations to ensure panelist engagement and participation among respondents with a range of sociodemographic characteristics; this finding is consistent with past studies [22][23][24][25]. An important issue in survey research is ensuring representation from all population groups, including those who are traditionally more difficult to reach in surveys [26,27]. Strategies, such as oversampling and using non-probability frames, have been cited as potential solutions [28][29][30]. Although there is research on using various targeted invitation designs to improve survey participation [31][32][33], our findings can help other researchers develop an invitation strategy tailored to the subpopulations being surveyed in panels and identify response patterns. For example, past studies have found that Black survey invitees are more likely to complete surveys by mail than by web, and our findings corroborate this [34,35]. We recommend using targeted invitation modes as an additional strategy to improve participation when needed. Yet, it is noteworthy that 85% of our active panelists consistently responded to the same type of invitation and may, therefore, only require that specific type of invitation in the future. In our study, panelists with lower household incomes were more likely to respond to a mailed paper survey and less likely to respond to a push-to-web invite than panelists with higher household incomes. Even though internet access and usage are widespread among the general population, inequities surrounding internet access remain.
Some panelists may have limited or no access and, therefore, may not be able to complete a survey online. Approximately 12.4% of NYC households do not have internet access [36], and neighborhoods with a higher proportion of racially and ethnically marginalized groups, households with lower household income, higher poverty levels, and residents with lower educational attainment have more limited internet access [37]. Although there was no difference between panelists with different household incomes in their propensity to respond to an email invite, we found that panelists with higher household incomes were much more likely to provide an email contact compared to panelists with lower household incomes. Therefore, it may be reasonable to provide lower-income panelists who do not provide an email contact with additional response options not requiring internet access (for example, phone or paper). Lastly, we found that the number of invitations sent before receiving a response is likely not predicted by sociodemographic characteristics. In the analyses that included mail invitations, panelists who were 45-64 or 65+ years old were sent more invitations before responding across most or all of the six surveys. However, once mailed invitations were excluded from the analysis, there was no longer a consistent association between age and the number of invitations required to elicit a response. This difference in the findings can be attributed to three issues related to the design of the survey administration.
First, older panelists were prioritized to receive mailed push-to-web and follow-up paper surveys based on the assumption that older adults would be less likely to respond to email and text invitations [35][38][39][40]. Second, since the push-to-web and paper surveys were distributed by a vendor that required up to two weeks to prepare, print, and mail the materials, there was limited opportunity to remove individuals from reminder mailing lists once they responded to the survey. Therefore, unlike with web responses, we were unable to stop future paper invitations to persons who had already responded to a mail invite, so these people could have received an invitation after completing the survey. Third, participants were told that they would receive a paper survey in the mail towards the end of the fielding period if needed; some panelists may have waited to receive the paper survey to respond. In summary, since recipients of push-to-web and mailed paper surveys were sent at least two invites regardless of response status and may have waited to get a paper survey, they likely received more invitations because of this operational arrangement.
Our study has some limitations. This analysis includes only six survey months, and analyzing additional surveys across a longer period can further validate the findings. Furthermore, in an effort to balance both budgetary constraints and operational needs, especially during the COVID-19 pandemic, we were not able to randomize the types of invitations that panelists received. A randomized experiment that included only panelists who provided all three types of contact information would have allowed us to draw stronger inferences about the association between panelists' sociodemographic characteristics and their likelihood of responding to various types of invitations. Instead, we included individuals who provided at least one type of contact information, we sent invitations based on the contact modes panelists provided, and the limited mailed invitations were prioritized to older panelists and those who provided only a mailing address for contact. Past research has shown that older survey respondents are more likely to respond to mailed paper surveys than online surveys [35][38][39][40]. These studies corroborate our findings that, even among those who provided email and/or text contact information, older panelists were less likely to respond to email and text invitations than younger panelists. Strategies to maintain an engaged panel that is representative of the NYC population are crucial in improving the accuracy of public health surveillance data. The data collected through Healthy NYC provide guidance to public health leaders for developing evidence-based policies and programs to improve the health and wellness of NYC residents, and it is essential that these data are representative of the population being served.
As the Healthy NYC panel matures, we plan to increase representativeness of each survey by targeting invitations based on panelists' preferences, as described in survey questions about their preferred way to be contacted, as well as past response patterns. We will also refine our invitation approaches by assessing whether certain combinations of invitation modes lead to better participation rates within certain sociodemographic subgroups of the panel. For example, an initial email invitation followed by a text reminder may be more motivating for some populations than an initial text invitation followed by an email reminder.
In conclusion, this research provides valuable insight into a diverse probability-based survey panel and the potential benefits of using a targeted invitation strategy. In a city agency with limited resources, improving survey response through targeted invitations in panels can be cost-effective and efficient.

S2 Table. Raw tables of the six Poisson regression models (email and text invitations only).
(DOCX)